gh-135871: Reload lock internal state while spinning in PyMutex_LockTimed#146064
Open
dpdani wants to merge 1 commit intopython:mainfrom
Open
gh-135871: Reload lock internal state while spinning in PyMutex_LockTimed#146064dpdani wants to merge 1 commit intopython:mainfrom
PyMutex_LockTimed#146064dpdani wants to merge 1 commit intopython:mainfrom
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR adds atomic loads in the slow path of
PyMutexto increase the number of lock acquisitions per second that threads can make on a shared mutex.The tricky part here is to avoid degrading the performance when the lock is highly contended; that is, when many threads are trying to acquire the mutex at a high frequency. This is because the current strategy of never reloading the mutex's state is, maybe counter-intuitively, the best strategy: avoid disturbing the thread that currently holds the mutex.
I've run the lockbench script to assess the performance using the following two scenarios, which were suggested by @colesbury:
The results are from my M4 MacBook, and Python was compiled with
--disable-gil --enable-optimizations --with-lto.The different lines below represent different reloading strategies:
mainifstatement at line 107RELOAD_SPIN_COUNT = ...: tweak the value at line 30The
RELOAD_SPIN_COUNT = ...strategies also pseudo-randomize the reload recurrence by having each thread add its own thread-id to the counter. On my machine this nets a ~15/20% improvement over the same strategy, without this pseudo-randomization.And here's the comparison between
mainand this PR, which picks theRELOAD_SPIN_COUNT = 3strategy:The fact that the high contention case shows as an improvement in this chart is misleading: I'd say it's in the noise range.
_PyMutex_LockTimedspins and may fail unnecessarily in no-GIL builds #135871